Fix KubernetesPodOperator 404 on pod preemption #69062

Open
ambika-garg wants to merge 1 commit into apache:main from ambika-garg:fix-kpo-404-on-preemption

Conversation

@ambika-garg ambika-garg commented Jun 26, 2026

Contributor

Key changes:

  • Sync Path (PodManager.read_pod): Retries 404 responses up to 3 times with exponential backoff (2s, 4s, 8s). If the pod was previously observed in the Running phase (or any phase other than Pending), retries are skipped and PodNotFoundException is raised immediately.
  • Async Path (AsyncKubernetesHook.get_pod): Modified to accept an optional pod argument to inspect the pod's last known state, applying the exact same exponential backoff and phase-based safeguards.
  • State Tracking (KubernetesPodTrigger): Updated the trigger to track self.last_pod during its polling loop and pass it down to AsyncKubernetesHook.get_pod to persist state across the stateless hook calls.
  • Testing: Added tests covering the retry logic and phase-based safeguards in both test_pod_manager.py and test_kubernetes.py.
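
The retry rule described for the sync path can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `read_pod_with_retry`, `ApiNotFound`, the `fetch` callable, and the injectable `sleep` parameter are all hypothetical names, and `PodNotFoundException` is redefined locally as a stand-in. It also assumes that a pod whose phase was never observed is treated like a Pending pod, which the description does not state.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class ApiNotFound(Exception):
    """Hypothetical stand-in for a 404 response from the Kubernetes API."""


class PodNotFoundException(Exception):
    """Local stand-in for the exception named in the PR description."""


def read_pod_with_retry(
    fetch: Callable[[], T],
    last_phase: Optional[str] = None,
    max_retries: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying 404s with exponential backoff while allowed."""
    # Retry only if the pod may still be starting up. Treating an unknown
    # phase (None) like Pending is an assumption of this sketch.
    can_retry = last_phase in (None, "Pending")
    retries_left = max_retries if can_retry else 0
    delay = base_delay
    while True:
        try:
            return fetch()
        except ApiNotFound as exc:
            if retries_left == 0:
                # Either retries are exhausted or the pod was already seen
                # past Pending, so a 404 means it is really gone.
                raise PodNotFoundException("pod not found") from exc
            sleep(delay)  # 2s, then 4s, then 8s with the defaults
            delay *= 2
            retries_left -= 1
```

Passing `sleep` as a parameter keeps the sketch testable without real waits. An async variant could follow the same shape with an awaited sleep in place of the blocking one, with the trigger supplying the last observed phase on each call.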

In case of an existing issue, reference it using one of the following:

closes: #59626


Was generative AI tooling used to co-author this PR?
  • [X] Yes (please specify the tool below)


@boring-cyborg boring-cyborg Bot added area:providers provider:cncf-kubernetes Kubernetes (k8s) provider related issues labels Jun 26, 2026
@ambika-garg

Contributor Author

Hi @jscheffl, could you please review my PR when you get a chance? Once you approve the code changes, I'll update the test classes accordingly.

github-actions Bot commented Jun 26, 2026

Contributor

uv.lock on main just moved via #69081 ("Cap airbyte-api below 1.0.0 to unblock provider tests"), commit 5c4f1ab, and this PR now conflicts with it.

Quickest fix:

git fetch upstream main && git rebase upstream/main
rm uv.lock && uv lock
git add uv.lock && git rebase --continue
git push --force-with-lease

Automated nudge — ignore if you're not ready to rebase. This comment is updated in place on future uv.lock bumps.

@ambika-garg ambika-garg force-pushed the fix-kpo-404-on-preemption branch from 9f6fd84 to 4f930ce Compare June 27, 2026 17:40

Development

Successfully merging this pull request may close these issues.

Kubernetes Pod Operator fails with 404 errors when pods are preempted by daemonsets
